Interpolated Dirichlet Class Language Model for Speech Recognition Incorporating Long-distance N-grams
Authors
Abstract
We propose a language modeling (LM) approach that incorporates interpolated distanced n-grams into a Dirichlet class language model (DCLM) (Chien and Chueh, 2011) for speech recognition. The DCLM relaxes the bag-of-words assumption and the document-level topic extraction of latent Dirichlet allocation (LDA): its latent variable reflects the class information of an n-gram event rather than an LDA topic. The DCLM uses default background n-grams, where class information is extracted from the (n-1) history words through a Dirichlet distribution when calculating n-gram probabilities. The model therefore does not capture long-range information from outside the n-gram window, which could improve language modeling performance. In this paper, we present an interpolated DCLM (IDCLM) that uses different distanced n-grams. Here, the class information is exploited from the (n-1) history words through the Dirichlet distribution using interpolated distanced n-grams. A variational Bayesian procedure is introduced to estimate the IDCLM parameters. We carried out experiments on a continuous speech recognition (CSR) task using the Wall Street Journal (WSJ) corpus. The proposed approach shows significant perplexity and word error rate (WER) reductions over the other approach.
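The core idea of distanced n-grams, as the abstract describes it, is to condition a word not only on its immediate predecessor but on words at several distances back, and to interpolate those distance-specific predictions with weights. The sketch below illustrates only that interpolation step with simple count-based distance-d bigrams and add-one smoothing; it is a minimal illustration, not the DCLM/IDCLM itself (the paper's model routes the history through Dirichlet-distributed class variables, which is omitted here). All function names and the smoothing choice are assumptions for illustration.

```python
from collections import defaultdict

def train_distanced_bigrams(corpus, max_distance=3):
    """Count distance-d word pairs (w_{t-d}, w_t) for d = 1..max_distance."""
    counts = {d: defaultdict(lambda: defaultdict(int)) for d in range(1, max_distance + 1)}
    totals = {d: defaultdict(int) for d in range(1, max_distance + 1)}
    for sentence in corpus:
        for t, w in enumerate(sentence):
            for d in range(1, max_distance + 1):
                if t - d >= 0:
                    h = sentence[t - d]
                    counts[d][h][w] += 1
                    totals[d][h] += 1
    return counts, totals

def interpolated_prob(w, history, counts, totals, weights, vocab_size):
    """P(w | history) = sum_d lambda_d * P_d(w | w_{t-d}), add-one smoothed.

    weights[d-1] is the interpolation weight lambda_d for distance d;
    the weights are assumed to sum to one.
    """
    p = 0.0
    for d, lam in enumerate(weights, start=1):
        if len(history) >= d:
            h = history[-d]                      # word at distance d in the history
            num = counts[d][h][w] + 1            # add-one smoothing
            den = totals[d][h] + vocab_size
            p += lam * num / den
    return p
```

With weights summing to one and a full-length history, the interpolated probabilities still sum to one over the vocabulary, which is what makes this a proper mixture of distance-specific models.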
Similar resources
Fitting long-range information using interpolated distanced n-grams and cache models into a latent dirichlet language model for speech recognition
We propose a language modeling (LM) approach using interpolated distanced n-grams in a latent Dirichlet language model (LDLM) [1] for speech recognition. The LDLM relaxes the bag-of-words assumption and the document topic extraction of latent Dirichlet allocation (LDA). It uses default background n-grams where topic information is extracted from the (n-1) history words through Dirichlet distributi...
An automatic acquisition method of statistic finite-state automaton for sentences
Statistic language models obtained from a large number of training samples play an important role in speech recognition. In order to obtain higher recognition performance, we should introduce long distance correlations between words. However, traditional statistic language models such as word n-grams and ergodic HMMs are insufficient for expressing long distance correlations between words. In t...
A Fast Re-scoring Strategy to Capture Long-Distance Dependencies
A re-scoring strategy is proposed that makes it feasible to capture more long-distance dependencies in the natural language. Two pass strategies have become popular in a number of recognition tasks such as ASR (automatic speech recognition), MT (machine translation) and OCR (optical character recognition). The first pass typically applies a weak language model (n-grams) to a lattice and the sec...
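The two-pass strategy this snippet describes — a weak n-gram model in the first pass, a stronger long-distance model applied afterwards — is commonly realized as n-best rescoring. The following is a minimal sketch of that second pass under assumed inputs: `nbest` holds first-pass hypotheses with their scores, and `strong_lm_score` stands in for any stronger language model too expensive to use during decoding. The function name and weighting scheme are illustrative assumptions, not the paper's exact formulation.

```python
def rescore_nbest(nbest, strong_lm_score, first_pass_weight=1.0, lm_weight=0.5):
    """Second-pass rescoring of an n-best list.

    nbest: list of (hypothesis_words, first_pass_score) pairs, where the
    first-pass score comes from decoding with a weak n-gram LM.
    strong_lm_score: callable returning a log-probability for a word
    sequence from a stronger (e.g. long-distance) language model.
    Returns the hypothesis with the highest combined score.
    """
    best_words, best_total = None, float("-inf")
    for words, fp_score in nbest:
        total = first_pass_weight * fp_score + lm_weight * strong_lm_score(words)
        if total > best_total:
            best_total, best_words = total, words
    return best_words
```

In practice the weights are tuned on held-out data, and the same idea extends from n-best lists to lattice rescoring.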
Beyond N-Grams: Can Linguistic Sophistication Improve Language Modeling?
It seems obvious that a successful model of natural language would incorporate a great deal of both linguistic and world knowledge. Interestingly, state of the art language models for speech recognition are based on a very crude linguistic model, namely conditioning the probability of a word on a small fixed number of preceding words. Despite many attempts to incorporate more sophisticated info...
Placing structuring elements in a word sequence for generating new statistical language models
Class based n-gram language models have been applied successfully in speech technology. We will present an automatic method to improve n-gram language models by distributing structural elements in a new way in word sequences. Our algorithm works on textual data consisting of two different kinds of text elements, namely words and structural elements. The order of words will not be changed during...
Publication date: 2014